FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file jan11a.res made by on Wed 11 Jan 95 12:32:38-PST.

Query sequence being compared:   CL16 (1-21)
Number of sequences searched:    302507
Number of scores above cutoff:   4620

Results of the initial comparison of CL16 (1-21) with:
Data bank : EMBL-NEW 10, all entries
Data bank : GenBank 85, all entries
Data bank : GenBank-NEW 10, all entries
Data bank : HIV-NA 7, all entries
Data bank : Issued_NA , all entries
Data bank : N-GeneSeq 16.3, all entries
Data bank : UEMBL 40_85, all entries
Data bank : VectorBank 9, all entries



PARAMETERS

Similarity matrix       Unitary       K-tuple                4
Mismatch penalty              1       Joining penalty       30
Gap penalty                1.00       Window size           14
Gap size penalty           0.33
Cutoff score                  1
Randomization group           0

Initial scores to save       30       Alignments to save    30
Optimized scores to save     30       Display context      100



SEARCH STATISTICS

Scores:          Mean       Median       Standard Deviation
                    6            7                     2.92

Times:            CPU                         Total Elapsed
          00:13:01.04                           00:13:36.00

Number of residues:              276734581
Number of sequences searched:       302507
Number of scores above cutoff:        4620



Cut-off raised to 4. 
Cut-off raised to 5. 
Cut-off raised to 6. 
Cut-off raised to 7. 
Cut-off raised to 8. 
Cut-off raised to 9. 
Cut-off raised to 10. 



Cut-off raised to 11. 
Cut-off raised to 12. 
Cut-off raised to 13. 
Cut-off raised to 14. 

The scores below are sorted by initial score. 
Significance is calculated based on initial score. 

A 100% identical sequence to the query sequence was not found.



The list of best scores is:

                                                            Init.  Opt.
     Sequence Name    Description                    Length Score Score   Sig. Frame

                    **** 3 standard deviations above mean ****
  1. XELRGASBC        x.borealis somatic 5s rrna ge     375    17    20   3.77     0
  2. XELRGASBA        X.borealis somatic 5S rRNA ge     858    17    20   3.77     0
  3. XBRNA2           Xenopus borealis gene for 5S      858    17    20   3.77     0
  4. RATMAP1A         Rat MAP-1 gene encoding major    1402    17    17   3.77     0
  5. RATTKG1          Rat T-kininogen (T-KG) gene,     1903    17    17   3.77     0
  6. RNAMDX23         R.norvegicus S-adenosylmethio    2021    17    17   3.77     0
  7. ZDHRGP           Z.diploperennis gene for hydr    4478    17    17   3.77     0
  8. MMT1CPS          Mouse Tla region Tic pseudoge    8147    17    17   3.77     0
  9. RATSADMEDC       Rat AdoMetDC gene, complete C   17167    17    17   3.77     0
 10. CELFS8F5         Caenorhabditis elegans cosmid   32903    17    17   3.77     0
 11. CELF28F5         Caenorhabditis elegans cosmid   32903    17    17   3.77     0
 12. CHNTXX           Tobacco chloroplast genome DN  155844    17    17   3.77     0
 13. N60861           Fragment of plasmid PXC204 en     146    16    16   3.43     0
 14. T24747           EST322 Homo sapiens cDNA clon     186    16    16   3.43     0
 15. HS7476           EST322 Homo sapiens cDNA clon     186    16    16   3.43     0
 16. NVIRGAA          Newt (Notophthalmus viridesce     235    16    18   3.43     0
 17. NV5SRRN          Notophthalmus viridescens 5S      235    16    18   3.43     0
 18. N60862           Fragment of plasmid PXC204 en     288    16    16   3.43     0
 19. XELCRLB          Xenopus laevis caerulein prec     301    16    16   3.43     0
 20. PABKTANT         BK virus 5' end of early regi     332    16    16   3.43     0
 21. XLCAER1          Xenopus laevis mRNA fragment      370    16    16   3.43     0
 22. XELCRLA          Xenopus laevis caerulein prec     370    16    16   3.43     0
 23. T08475           EST06366 Homo sapiens cDNA cl     383    16    16   3.43     0
 24. XELCRLG35        Xenopus laevis caerulein type     391    16    16   3.43     0
 25. XELCRLI          X.laevis caerulein mRNA, clon     395    16    16   3.43     0
 26. N60858           Sequence of plasmid PXC102 en     397    16    16   3.43     0
 27. PVBRESWW         Human papovavirus BK (strain      426    16    16   3.43     0
 28. HUMUT2361        Human STS UT2361.                 446    16    16   3.43     0
 29. BRRRPL37A        Brassica rapa ribosomal prote     446    16    16   3.43     0
 30. N50145           Sequence of enhancer DNA segm     451    16    16   3.43     0
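
The Sig. column in the list above appears to be the score expressed in standard deviations above the mean reported under SEARCH STATISTICS. A minimal Python sketch of that relationship follows, assuming Sig = (score - mean) / SD. The function name is hypothetical, and because the report prints the mean rounded to a whole number, recomputed values may differ from the listed ones in the last digit.

```python
def significance(score, mean, sd):
    """Score expressed in standard deviations above the mean (assumed formula)."""
    return (score - mean) / sd

# Values from the initial comparison: mean 6, standard deviation 2.92.
for score in (17, 16):
    print(score, round(significance(score, 6, 2.92), 2))
```

The same calculation applied to the optimized statistics of a report would use that report's own mean and standard deviation.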


Query sequence being compared:   CL16 (1-21)
Number of sequences optimized:   4620

Results of the optimized comparison of CL16 (1-21) with:
Data bank : EMBL-NEW 10, all entries
Data bank : GenBank 85, all entries
Data bank : GenBank-NEW 10, all entries
Data bank : HIV-NA 7, all entries
Data bank : Issued_NA , all entries
Data bank : N-GeneSeq 16.3, all entries
Data bank : UEMBL 40_85, all entries
Data bank : VectorBank 9, all entries



PARAMETERS

Similarity matrix       Unitary       K-tuple                4
Mismatch penalty              1       Joining penalty       30
Gap penalty                1.00       Window size           14
Gap size penalty           0.33
Cutoff score                  1
Randomization group           0

Initial scores to save       30       Alignments to save    30
Optimized scores to save     30       Display context      100






SEARCH STATISTICS

Scores:          Mean       Median       Standard Deviation
                   14           15                     0.80

Times:            CPU                         Total Elapsed
          00:01:06.99                           00:01:38.00

Number of residues:               24455601
Number of sequences optimized:        4620

The scores below are sorted by optimized score.
Significance is calculated based on optimized score.

A 100% identical sequence to the query sequence was not found.

The list of best scores is:

                                                            Init.  Opt.
     Sequence Name    Description                    Length Score Score   Sig. Frame

                    **** 7 standard deviations above mean ****
  1. XELRGASBC        x.borealis somatic 5s rrna ge     375    17    20   7.51     0
  2. XELRGASBA        X.borealis somatic 5S rRNA ge     858    17    20   7.51     0
  3. XBRNA2           Xenopus borealis gene for 5S      858    17    20   7.51     0
                    **** 6 standard deviations above mean ****
  4. XELRGASL         X.laevis somatic 5S rRNA gene     888    16    19   6.26     0
                    **** 5 standard deviations above mean ****
  5. NV5SRRN          Notophthalmus viridescens 5S      235    16    18   5.01     0
  6. NVIRGAA          Newt (Notophthalmus viridesce     235    16    18   5.01     0
  7. XELRGAOB         x.borealis oocyte 5s dna.         761    15    18   5.01     0
  8. XBRNA1           Xenopus borealis genes (three     761    15    18   5.01     0
                    **** 3 standard deviations above mean ****
  9. MMT1CPS          Mouse Tla region Tic pseudoge    8147    17    17   3.76     0
 10. ZDHRGP           Z.diploperennis gene for hydr    4478    17    17   3.76     0
 11. RATTKG1          Rat T-kininogen (T-KG) gene,     1903    17    17   3.76     0
 12. CELFS8F5         Caenorhabditis elegans cosmid   32903    17    17   3.76     0
 13. RNAMDX23         R.norvegicus S-adenosylmethio    2021    17    17   3.76     0
 14. CELF28F5         Caenorhabditis elegans cosmid   32903    17    17   3.76     0
 15. RATMAP1A         Rat MAP-1 gene encoding major    1402    17    17   3.76     0
 16. CHNTXX           Tobacco chloroplast genome DN  155844    17    17   3.76     0
 17. RATSADMEDC       Rat AdoMetDC gene, complete C   17167    17    17   3.76     0
 18. ONHGHCQHO        Oncorhynchus kisutch (coho sa    1201    15    17   3.76     0
 19. STREIPEPA        Streptococcus salivarius phos    2259    15    17   3.76     0
 20. XLXK70A          Xenopus laevis XK70A gene for    6266    15    17   3.76     0
 21. CEF54C8          Caenorhabditis elegans cosmid   23000    15    17   3.76     0
 22. CEF54C8          Caenorhabditis elegans cosmid   23000    15    17   3.76     0
 23. MIOACYTB         O.aries mitochondrion cytb ge    1140    13    17   3.76     0
 24. SV4EV211         SV40 variant genome ev-2114,      100    14    17   3.76     0
 25. DR07DC142        Drosophila melanogaster (subc                  17   3.76
 26. T10577           hbc220 Homo sapiens cDNA clon     560    13    17   3.76     0
 27. NEUFRG           Neurospora crassa mRNA sequen    4631    13    17   3.76     0
 28. HSA26A071        H. sapiens partial cDNA seque     347    13    17   3.76     0
 29. HSA39H101        H. sapiens partial cDNA seque     345    12    17   3.76     0
 30. PCT-US93-04648-1 Sequence 15, Application        10596    14    17   3.76     0



1. CL16 (1-21)
   XELRGASBC    x.borealis somatic 5s rrna gene, clone pxbsf201.

LOCUS       XELRGASBC     375 bp ds-DNA             VRT       05-JUN-1991
DEFINITION  x.borealis somatic 5s rrna gene, clone pxbsf201.
ACCESSION   K01537
KEYWORDS    5S ribosomal RNA; ribosomal RNA.
SOURCE      xenopus borealis dna, clone pxbsf201.
  ORGANISM  Xenopus laevis
            Eukaryota; Animalia; Chordata; Vertebrata; Amphibia; Lissamphibia;
            Anura; Archeobatrachia; Pipoidea; Pipidae; Xenopodinae.
REFERENCE   1  (bases 1 to 375)
  AUTHORS   Razvi,F., Gargiulo,G. and Worcel,A.
  TITLE     a simple procedure for parallel sequence analysis of both strands
            of 5'-labeled dna
  JOURNAL   Gene 23, 175-183 (1983)
  STANDARD  full automatic
COMMENT     NCBI gi: 214699
FEATURES             Location/Qualifiers
     source          1..375
                     /organism="Xenopus laevis"
     misc_feature    complement(1..29)
                     /note="putative VECTOR sequence Vector pUC19 (M11662);
                     putative"
     rRNA            80..199
                     /note="5s rrna"
     misc_feature    286..375
                     /note="putative VECTOR sequence Bacteriophage M13mp18
                     (M11454); putative"
BASE COUNT       80 a    116 c     96 g     83 t
ORIGIN      2 bp upstream of alui site.

Initial Score    =       17   Optimized Score    =       20   Significance  =   7.51
Residue Identity =      95%   Matches            =       20   Mismatches    =      1
Gaps             =        0   Conservative Substitutions           =      0

CATACCACCCTGAAAGTGCCCGATATCGTCTGATCTCGGAAGCCAAGCAGGGTCGGGCCTGGTTAGTACTTG
90       100       110       120       130       140       150       160

                            X       10          X
                            GTCCTAGGCTTTTGCACTTTT
                            ||| |||||||||||||||||
GATGGGAGACCGCCTGGGAATACCAGGTGTCGTAGGCTTTTGCACTTTTGCCATTCTGAGTAACAGCAGGGG
       170       180       190       200       210       220       230

GCAGTCTCCTCCATGCATTTTTCTTTCCCCGAACAGCCGGATCCCCGGGAATTCACTGGCCGTCGTTTTACA
     240       250       260       270       280       290       300

ACGTC
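
The Residue Identity figure reported with the alignment above is consistent with the number of matches divided by the total number of aligned columns. A minimal Python sketch follows, assuming gap positions are counted in the denominator. The function name and the rounding to a whole percent are assumptions, not documented FastDB behavior.

```python
def residue_identity(matches, mismatches, gaps):
    """Percent identity over all aligned columns (assumed definition)."""
    columns = matches + mismatches + gaps
    return round(100 * matches / columns)

# Counts reported with the alignment above: 20 matches, 1 mismatch, 0 gaps.
print(residue_identity(20, 1, 0))
```

Counting gaps in the denominator is a design choice in this sketch; with no gaps, as here, it makes no difference to the result.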
FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file jan11b.res made by on Wed 11 Jan 95 12:32:44-PST.

Query sequence being compared:   CL17 (1-38)
Number of sequences searched:    302507
Number of scores above cutoff:   4183

Results of the initial comparison of CL17 (1-38) with:
Data bank : EMBL-NEW 10, all entries
Data bank : GenBank 85, all entries
Data bank : GenBank-NEW 10, all entries
Data bank : HIV-NA 7, all entries
Data bank : Issued_NA , all entries
Data bank : N-GeneSeq 16.3, all entries
Data bank : UEMBL 40_85, all entries
Data bank : VectorBank 9, all entries



PARAMETERS

Similarity matrix       Unitary       K-tuple                4
Mismatch penalty              1       Joining penalty       30
Gap penalty                1.00       Window size           14
Gap size penalty           0.33
Cutoff score                  2
Randomization group           0

Initial scores to save       30       Alignments to save    30
Optimized scores to save     30       Display context      100




SEARCH STATISTICS

Scores:          Mean       Median       Standard Deviation
                    9           10                     4.78

Times:            CPU                         Total Elapsed
          00:13:23.07                           00:13:42.00

Number of residues:              276734581
Number of sequences searched:       302507
Number of scores above cutoff:        4183



Cut-off raised to 4.
Cut-off raised to 5.
Cut-off raised to 6.
Cut-off raised to 7.
Cut-off raised to 9.
Cut-off raised to 11.
Cut-off raised to 13.
Cut-off raised to 14.
Cut-off raised to 15.
Cut-off raised to 16.
Cut-off raised to 17.
Cut-off raised to 18.
Cut-off raised to 19.
Cut-off raised to 20.
Cut-off raised to 21.

The scores below are sorted by initial score.
Significance is calculated based on initial score.

A 100% identical sequence to the query sequence was not found.



The list of best scores is:

                                                            Init.  Opt.
     Sequence Name    Description                    Length Score Score   Sig. Frame

                    **** 3 standard deviations above mean ****
  1. HSU09850         Human zinc finger protein (ZN    3908    28    30   3.98     0
  2. USMURBS1A        Ustilago maydis URBS1 protein    3987    28    31   3.98     0
  3. S76114           (right virus-host integration     569    27    29   3.77     0
  4. OCACE3P          O.cuniculus DNA for angiotens     978    27    28   3.77     0
  5. HUMCOUPII        Homo sapiens chick ovalbumin     2268    27    28   3.77     0
  6. HSCOUPII         Homo sapiens chick ovalbumin     2268    27    28   3.77     0
  7. PALHISH2H3       P. lividus histone H3 and H2A    2291    27    29   3.77     0
  8. RABACEA          Rabbit angiotensin converting    2409    27    28   3.77     0
  9. MMSK5            Mouse glandular kallikrein ge    3610    27    30   3.77     0
 10. LUMHBC           Earthworm (L.terrestris) extr    4037    27    31   3.77     0
 11. MMIFOR           M.musculus mRNA for formin (i    4241    27    30   3.77     0
 12. OCANCOE          O.cuniculus mRNA for angioten    4800    27    28   3.77     0
 13. MMLDF            M.musculus limb deformity mRN    4973    27    30   3.77     0
 14. CELB0280         Caenorhabditis elegans cosmid   41088    27    28   3.77     0
 15. CEB0280          Caenorhabditis elegans cosmid   41088    27    28   3.77     0
 16. RICR20321A       Rice cDNA, partial sequence (     271    26    29   3.56     0
 17. T21884           3892 Arabidopsis thaliana cDN     278    26    26   3.56     0
 18. RIC1140A         Rice cDNA, partial sequence (     353    26    28   3.56     0
 19. T09049           EST06941 Homo sapiens cDNA cl     394    26    27   3.56     0
 20. RATMLCB1         Rat cardiac myosin light chai     549    26    28   3.56     0
 21. HUMITILC03       Human inter-alpha-trypsin inh     618    26    29   3.56     0
 22. HUMMHDVB2        Human MHC class II HLA-DV-bet     745    26    29   3.56     0
 23. MUSNAKATPQ       Mouse Na,K-ATPase beta2 subun    1128    26    29   3.56     0
 24. PSELINC          P. paucimobilis linC gene for    1148    26    26   3.56     0
 25. PPLINC           P. paucimobilis linC gene for    1148    26    26   3.56     0
 26. BOVPROA          Bovine protamine gene PI alle    1340    26    27   3.56     0
 27. BOVPROB          Bovine protamine gene PI alle    1369    26    27   3.56     0
 28. HUMMHDQBAA       Human MHC class II HLA-DQB3 p    1416    26    29   3.56     0
 29. MUSIGHYC1        Mouse Ig heavy-chain variable    1599    26    30   3.56     0
 30. MMIGVH28         Mouse immunoglobulin J558 V(H    1599    26    30   3.56     0



Query sequence being compared:   CL17 (1-38)
Number of sequences optimized:   4183

Results of the optimized comparison of CL17 (1-38) with:
Data bank : EMBL-NEW 10, all entries
Data bank : GenBank 85, all entries
Data bank : GenBank-NEW 10, all entries
Data bank : HIV-NA 7, all entries
Data bank : Issued_NA , all entries
Data bank : N-GeneSeq 16.3, all entries
Data bank : UEMBL 40_85, all entries
Data bank : VectorBank 9, all entries



PARAMETERS

Similarity matrix       Unitary       K-tuple                4
Mismatch penalty              1       Joining penalty       30
Gap penalty                1.00       Window size           14
Gap size penalty           0.33
Cutoff score                  2
Randomization group           0

Initial scores to save       30       Alignments to save    30
Optimized scores to save     30       Display context      100



SEARCH STATISTICS

Scores:          Mean       Median       Standard Deviation
                   24           26                     1.45

Times:            CPU                         Total Elapsed
          00:00:49.98                           00:01:03.00

Number of residues:               16505319
Number of sequences optimized:        4183

The scores below are sorted by optimized score.
Significance is calculated based on optimized score.

A 100% identical sequence to the query sequence was not found.



The list of best scores is:

                                                            Init.  Opt.
     Sequence Name    Description                    Length Score Score   Sig. Frame

                    **** 4 standard deviations above mean ****
  1. SHPIGFIIA        Ovis aries insulin-like growt    1036    25    31   4.82     0
  2. USMURBS1A        Ustilago maydis URBS1 protein    3987    28    31   4.82     0
  3. Q61404           Human brain Expressed Sequenc     361    23    31   4.82     0
  4. M79245           EST01393 Homo sapiens cDNA cl     361    23    31   4.82     0
  5. HSCD19           H. sapiens RNA for CD19.         1910    21    31   4.82     0
  6. HUMCD19W01       Human CD19 gene, exons 1-4.      1916    21    31   4.82     0
  7. N90612           CD19 cDNA.                       1921    21    31   4.82     0
  8. Q2117S           Human CD19 antigen coding seq    1922    21    31   4.82     0
  9. LUMHBC           Earthworm (L.terrestris) extr    4037    27    31   4.82     0
 10. OANIGFII4        Ovis aries insulin-like growt     547    25    31   4.82     0
 11. HUMCD19G         Human CD19 gene, complete cds    8743    21    31   4.82     0
 12. HUMANTCD         Human differentiation antigen    1922    21    31   4.82     0
 13. HUMCSPC          Human cell surface protein CD    2096    21    31   4.82     0




14. 


riA T M T f2C T T A 

UR I IN lor 114 


Ovis aries insulin-like growt 


CAT 
*J*f / 






A 


AO 




i tr 


MPlLUr 


M. musculus limb deformity mRN 


AQ77 


C f 


*5to 


A 


1 *? 




lb. 




Mouse glandular kallikrein ge 


~2r i ft 


0~7 
C / 




A 
*t. 


X O 


l7l 


1 /. 


MM T CflD 


M. musculus mRNA for formin (i 


AO A 1 


c / 


"701 


A 


1 7 




lo. 




Human zinc finger protein (ZN 


"7QI7.Q 


OQ 
CO 


"Jft 


A 


1 "7 
1 O 


171 


1*7. 




o4o-*3 bequence t5 5 Hpplication Ub 


1 "7 A 1 f7t 
1 /41 W 


cl 


»3t0 


A 




(0 




HSTUBRb 


Human gene for alpha-t ubul in 


/. (710*7 

4(0o / 


OA 

c4 




A 


1 "7 


i7i 


dl . 


RNhId 


Rat nHNH for the alpha-IB adr 


etOob 


O i 

cl 


-r>ft 


A 

4. 


1 "7 


10 


2d. 


RATRDLX 


Rat homeoprotein (rDlx) raRNR, 


4 *7Qft 


O 1 

cl 




4. 




to 




DMLABR 


Drosophila melanogaster F24 m 


CliJ^ 


OCT 

CJ 




A 

4. 


i -7 
1 ^ 


ft 
10 


£4. 


Q5*5l4£ 


Sequence encoding osteogenic 


1 7/. 4 ft 

1 /41(0 


o * 

cl 


*7ft 

*itO 


4. 


1 *7 


to 


OCT 

CJ> 




rumuscuius genes nuA H«*t ano 


ODW 1 


OA 
C*r 




*+» 






 26. DMLABG1          Drosophila melanogaster F24 1    1846    25    30   4.13     0
 27. RATGENOME        Rat gene for alpha 1B adrener    2387    21    30   4.13     0
 28. OSRGP1           Rice rgp1 mRNA for a ras-rela    1303    22    30   4.13     0
 29. MUSIGHYC1        Mouse Ig heavy-chain variable    1599    26    30   4.13     0
 30. MMIGVH28         Mouse immunoglobulin J558 V(H    1599    26    30   4.13     0



1. CL17 (1-38)
   SHPIGFIIA    Ovis aries insulin-like growth factor II (IGF-II)

LOCUS       SHPIGFIIA    1036 bp ss-mRNA            MAM       22-JUL-1993
DEFINITION  Ovis aries insulin-like growth factor II (IGF-II) mRNA, complete
            cds.
ACCESSION   M89788
KEYWORDS    insulin-like growth factor II.
SOURCE      Ovis aries (strain Coopworth) (library: random primed cDNA) lamb
            liver cDNA to mRNA.
  ORGANISM  Ovis aries
            Eukaryota; Animalia; Chordata; Vertebrata; Mammalia; Theria;
            Eutheria; Artiodactyla; Ruminantia; Pecora; Bovidae.
REFERENCE   1  (bases 1 to 1036)
  AUTHORS   Demmer,J., Hill,D.F. and Petersen,G.B.
  TITLE     Characterization of two sheep insulin-like growth factor II cDNAs
            with different 5'-untranslated regions
  JOURNAL   Biochim. Biophys. Acta 1173, 79-80 (1993)
  STANDARD  full automatic
COMMENT     NCBI gi: 165940
FEATURES             Location/Qualifiers
     source          1..1036
                     /organism="Ovis aries"
                     /strain="Coopworth"
                     /dev_stage="lamb"
                     /sequenced_mol="cDNA to mRNA"
                     /tissue_type="liver"
                     /tissue_lib="random primed cDNA"
     sig_peptide     102..173
                     /gene="IGF-II"
                     /codon_start=1
     CDS             102..641
                     /gene="IGF-II"
                     /note="NCBI gi: 165941"
                     /codon_start=1
                     /product="insulin-like growth factor II"
                     /translation="MGITAGKSMLALLAFLAFASCCYAAYRPSETLCGGELVDTLQFV
                     CGDRGFYFSRPSSRINRRSRGIMEECCFRSCDLALLETYCAAPAKSERDVSASTTVLP
                     DDFTOYPVGKFFQSDTWKQSTQRLRRGLPAFLRORRGRTLAKELEALREAKSHRPLIA
                     LPTQDPATHGGOSSEASSD"
     mat_peptide     174..374
                     /gene="IGF-II"
                     /codon_start=1
                     /product="insulin-like growth factor II"
BASE COUNT      220 a    368 c    236 g    212 t
ORIGIN

Initial Score    =       25   Optimized Score    =       31   Significance  =   4.82
Residue Identity =      80%   Matches            =       33   Mismatches    =      5
Gaps             =        3   Conservative Substitutions           =      0

                         X         10        20        30        X
                         TC--GACTCCTCTTCCTCCTCCOCCTCCTCCTCC-CATGCA
                         ||  |||||||| ||||||||| ||||||| ||| | | ||
GGTAGCTTCTCCTCGGAGGCAGCCTTCCAGACTCCTCCTCCTCCTCCTCCTCCTCATCCTCCTTCAGCCCCA
        10        20     X  30        40        50        60     X  70

GCGAGCCTCCTGTCCAGCTGCAGACATCAATGGGGATCACAGCAGGAAAGTCGATGCTGGCGCTTCTTGCCT
      80        90       100       110       120       130       140

TCTTGGCCTTCGCCTCGTGCTG
   150       160
FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file jan11c.res made by on Wed 11 Jan 95 12:47:51-PST.

Query sequence being compared:   CL26 (1-21)
Number of sequences searched:    302507
Number of scores above cutoff:   4620

Results of the initial comparison of CL26 (1-21) with:
Data bank : EMBL-NEW 10, all entries
Data bank : GenBank 85, all entries
Data bank : GenBank-NEW 10, all entries
Data bank : HIV-NA 7, all entries
Data bank : Issued_NA , all entries
Data bank : N-GeneSeq 16.3, all entries
Data bank : UEMBL 40_85, all entries
Data bank : VectorBank 9, all entries



PARAMETERS

Similarity matrix       Unitary       K-tuple                4
Mismatch penalty              1       Joining penalty       30
Gap penalty                1.00       Window size           14
Gap size penalty           0.33
Cutoff score                  1
Randomization group           0

Initial scores to save       30       Alignments to save    30
Optimized scores to save     30       Display context      100

SEARCH STATISTICS

Scores:          Mean       Median       Standard Deviation
                    6            7                     2.92

Times:            CPU                         Total Elapsed
          00:12:46.03                           00:12:48.00

Number of residues:              276734581
Number of sequences searched:       302507
Number of scores above cutoff:        4620



Cut-off raised to 4. 
Cut-off raised to 5. 
Cut-off raised to 6. 
Cut-off raised to 7. 
Cut-off raised to 8. 
Cut-off raised to 9. 
Cut-off raised to 10. 
Cut-off raised to 11. 



Cut-off raised to 12. 
Cut-off raised to 13. 
Cut-off raised to 14. 



The scores below are sorted by initial score. 
Significance is calculated based on initial score. 

A 100% identical sequence to the query sequence was not found.

The list of best scores is:

                                                            Init.  Opt.
     Sequence Name    Description                    Length Score Score   Sig. Frame

                    **** 3 standard deviations above mean ****
  1. XELRGASBC        x.borealis somatic 5s rrna ge     375    17    20   3.77     0
  2. XELRGASBA        X.borealis somatic 5S rRNA ge     858    17    20   3.77     0
  3. XBRNA2           Xenopus borealis gene for 5S      858    17    20   3.77     0
  4. RATMAP1A         Rat MAP-1 gene encoding major    1402    17    17   3.77     0
  5. RATTKG1          Rat T-kininogen (T-KG) gene,     1903    17    17   3.77     0
  6. RNAMDX23         R.norvegicus S-adenosylmethio    2021    17    17   3.77     0
  7. ZDHRGP           Z.diploperennis gene for hydr    4478    17    17   3.77     0
  8. MMT1CPS          Mouse Tla region Tic pseudoge    8147    17    17   3.77     0
  9. RATSADMEDC       Rat AdoMetDC gene, complete C   17167    17    17   3.77     0
 10. CELF28F5         Caenorhabditis elegans cosmid   32903    17    17   3.77     0
 11. CELF28F5         Caenorhabditis elegans cosmid   32903    17    17   3.77     0
 12. CHNTXX           Tobacco chloroplast genome DN  155844    17    17   3.77     0
 13. N60861           Fragment of plasmid PXC204 en     146    16    16   3.43     0
 14. T24747           EST322 Homo sapiens cDNA clon     186    16    16   3.43     0
 15. HS7476           EST322 Homo sapiens cDNA clon     186    16    16   3.43     0
 16. NVIRGAA          Newt (Notophthalmus viridesce     235    16    18   3.43     0
 17. NV5SRRN          Notophthalmus viridescens 5S      235    16    18   3.43     0
 18. N60862           Fragment of plasmid PXC204 en     288    16    16   3.43     0
 19. XELCRLB          Xenopus laevis caerulein prec     301    16    16   3.43     0
 20. PABKTANT         BK virus 5' end of early regi     332    16    16   3.43     0
 21. XLCAER1          Xenopus laevis mRNA fragment      370    16    16   3.43     0
 22. XELCRLA          Xenopus laevis caerulein prec     370    16    16   3.43     0
 23. T08475           EST06366 Homo sapiens cDNA cl     383    16    16   3.43     0
 24. XELCRLG35        Xenopus laevis caerulein type     391    16    16   3.43     0
 25. XELCRLI          X.laevis caerulein mRNA, clon     395    16    16   3.43     0
 26. N60858           Sequence of plasmid PXC102 en     397    16    16   3.43     0
 27. PVBRESWW         Human papovavirus BK (strain      426    16    16   3.43     0
 28. HUMUT2361        Human STS UT2361.                 446    16    16   3.43     0
 29. BRRRPL37A        Brassica rapa ribosomal prote     446    16    16   3.43     0
 30. N50145           Sequence of enhancer DNA segm     451    16    16   3.43     0



Query sequence being compared:   CL26 (1-21)
Number of sequences optimized:   4620

Results of the optimized comparison of CL26 (1-21) with:
Data bank : EMBL-NEW 10, all entries
Data bank : GenBank 85, all entries
Data bank : GenBank-NEW 10, all entries
Data bank : HIV-NA 7, all entries
Data bank : Issued_NA , all entries
Data bank : N-GeneSeq 16.3, all entries
Data bank : UEMBL 40_85, all entries
Data bank : VectorBank 9, all entries



PARAMETERS

Similarity matrix       Unitary       K-tuple                4
Mismatch penalty              1       Joining penalty       30
Gap penalty                1.00       Window size           14
Gap size penalty           0.33
Cutoff score                  1
Randomization group           0

Initial scores to save       30       Alignments to save    30
Optimized scores to save     30       Display context      100





SEARCH STATISTICS

Scores:          Mean       Median       Standard Deviation
                   14           15                     0.80

Times:            CPU                         Total Elapsed
          00:01:01.97                           00:01:07.00

Number of residues:               24455601
Number of sequences optimized:        4620



The scores below are sorted by optimized score. 
Significance is calculated based on optimized score. 

A 100% identical sequence to the query sequence was not found. 
The list of best scores is:

                                                            Init.  Opt.
     Sequence Name    Description                    Length Score Score   Sig. Frame

                    **** 7 standard deviations above mean ****
  1. XELRGASBC        x.borealis somatic 5s rrna ge     375    17    20   7.51     0
  2. XELRGASBA        X.borealis somatic 5S rRNA ge     858    17    20   7.51     0
  3. XBRNA2           Xenopus borealis gene for 5S      858    17    20   7.51     0
                    **** 6 standard deviations above mean ****
  4. XELRGASL         X.laevis somatic 5S rRNA gene     888    16    19   6.26     0
                    **** 5 standard deviations above mean ****
  5. NV5SRRN          Notophthalmus viridescens 5S      235    16    18   5.01     0
  6. NVIRGAA          Newt (Notophthalmus viridesce     235    16    18   5.01     0
  7. XELRGAOB         x.borealis oocyte 5s dna.         761    15    18   5.01     0
  8. XBRNA1           Xenopus borealis genes (three     761    15    18   5.01     0
                    **** 3 standard deviations above mean ****
  9. MMT1CPS          Mouse Tla region Tic pseudoge    8147    17    17   3.76     0
 10. ZDHRGP           Z.diploperennis gene for hydr    4478    17    17   3.76     0
 11. RATTKG1          Rat T-kininogen (T-KG) gene,     1903    17    17   3.76     0
 12. CELFS8F5         Caenorhabditis elegans cosmid   32903    17    17   3.76     0
 13. RNAMDX23         R.norvegicus S-adenosylmethio    2021    17    17   3.76     0
 14. CELF28F5         Caenorhabditis elegans cosmid   32903    17    17   3.76     0



1 

1 J. 


DQTMQD 1 Q 
KH l i T IHH 1 H 


nao ImHH 1 gene encoulny major 


1 A(7lP 


1 7 
1 / 


1 7 


7 7A 


ft 




rUMTYY 


i ouaCLu cniorupia5 v genuine uih 




1 7 


1 7 


*? 7f. 




1 7 
if. 


KH 1 DHUI'ltUL 


D -i 4- flHftMflf HP nana r-> m i~» 1 n f" 1 

not HuonetUb gene., complete L/ 


171 £7 




1 7 


■7 7£L 


ft 


1 A 


nwHRWPnMn 
uiNnonLfUnLi 


I - ) i~v /-> r\ w f> Kmc 1/ •» p iif r-\ l~i / r^nhn c 3 

uncornynwrius Kisutt.n vcrono 5a 


1 O0I1 

1 cwi 




1 7 


*^ 7A 


ft 


1 Q 




C^* v% o T" r% /-\ j-^ t i r* *r ^ 1 i « i *3 lite; ^ r** c~ 

ourcptococcus Sal i vanus pnos 


C1-D7 




1 7 


"? 7A 


ft 




Yl YK7ftO 


a en opus laevis Ai\(vr\ gene Tor 


DlDD 




1 7 


"? 7f» 


ft 


O 1 


Ltr J*tLo 


Laenor nduai t is eiegans cosmiu 






1 7 


7C* 


ft 


oo 

CCo 


LrC,rD*tLfO 


LdenornaDQiiis eiegans cosniio 




ID 


1 7 


*? 7f. 




O"? 
CJ. 




Ui arlcS hi 1 v OCnOnur 1 On Cy tu ge 




1 0 


1 7 


7A 


ft 


OA 




ovh^lJ vflnani genome ev ciih, 


1 01(71 


1 A 


1 7 
1 f 


7A 






DR07DC14Z 


Drosophila raelanogaster (subc 






17 


7fi 


ft 


 26. T10577           hbc220 Homo sapiens cDNA clon     560    13    17   3.76     0
 27. NEUFRG           Neurospora crassa mRNA sequen    4631    13    17   3.76     0
 28. HSA26A071        H. sapiens partial cDNA seque     347    13    17   3.76     0
 29. HSA39H101        H. sapiens partial cDNA seque     345    12    17   3.76     0
 30. PCT-US93-04648-1 Sequence 15, Application        10596    14    17   3.76     0



1. CL26 (1-21)
   XELRGASBC    x.borealis somatic 5s rrna gene, clone pxbsf201.

LOCUS       XELRGASBC     375 bp ds-DNA             VRT       05-JUN-1991
DEFINITION  x.borealis somatic 5s rrna gene, clone pxbsf201.
ACCESSION   K01537
KEYWORDS    5S ribosomal RNA; ribosomal RNA.
SOURCE      xenopus borealis dna, clone pxbsf201.
  ORGANISM  Xenopus laevis
            Eukaryota; Animalia; Chordata; Vertebrata; Amphibia; Lissamphibia;
            Anura; Archeobatrachia; Pipoidea; Pipidae; Xenopodinae.
REFERENCE   1  (bases 1 to 375)
  AUTHORS   Razvi,F., Gargiulo,G. and Worcel,A.
  TITLE     a simple procedure for parallel sequence analysis of both strands
            of 5'-labeled dna
  JOURNAL   Gene 23, 175-183 (1983)
  STANDARD  full automatic
COMMENT     NCBI gi: 214699
FEATURES             Location/Qualifiers
     source          1..375
                     /organism="Xenopus laevis"
     misc_feature    complement(1..29)
                     /note="putative VECTOR sequence Vector pUC19 (M11662);
                     putative"
     rRNA            80..199
                     /note="5s rrna"
     misc_feature    286..375
                     /note="putative VECTOR sequence Bacteriophage M13mp18
                     (M11454); putative"
BASE COUNT       80 a    116 c     96 g     83 t
ORIGIN      2 bp upstream of alui site.

Initial Score    =       17   Optimized Score    =       20   Significance  =   7.51
Residue Identity =      95%   Matches            =       20   Mismatches    =      1
Gaps             =        0   Conservative Substitutions           =      0

CATACCACCCTGAAAGTGCCCGATATCGTCTGATCTCGGAAGCCAAGCAGGGTCGGGCCTGGTTAGTACTTG
90       100       110       120       130       140       150       160

                            X       10          X
                            GTCCTAGGCTTTTGCACTTTT
                            ||| |||||||||||||||||
GATGGGAGACCGCCTGGGAATACCAGGTGTCGTAGGCTTTTGCACTTTTGCCATTCTGAGTAACAGCAGGGG
       170       180       190       200       210       220       230

GCAGTCTCCTCCATGCATTTTTCTTTCCCCGAACAGCCGGATCCCCGGGAATTCACTGGCCGTCGTTTTACA
     240       250       260       270       280       290       300

ACGTC
FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file jan11d.res made by on Wed 11 Jan 95 13:02:40-PST.

Query sequence being compared:   CL16' (1-21)
Number of sequences searched:    302507
Number of scores above cutoff:   4881

Results of the initial comparison of CL16' (1-21) with:
Data bank : EMBL-NEW 10, all entries
Data bank : GenBank 85, all entries
Data bank : GenBank-NEW 10, all entries
Data bank : HIV-NA 7, all entries
Data bank : Issued_NA , all entries
Data bank : N-GeneSeq 16.3, all entries
Data bank : UEMBL 40_85, all entries
Data bank : VectorBank 9, all entries

PARAMETERS

Similarity matrix       Unitary       K-tuple                4
Mismatch penalty              1       Joining penalty       30
Gap penalty                1.00       Window size           14
Gap size penalty           0.33
Cutoff score                  1
Randomization group           0

Initial scores to save       30       Alignments to save    30
Optimized scores to save     30       Display context      100





SEARCH STATISTICS

Scores:        Mean        Median        Standard Deviation
                  6             7                      3.01

Times:         CPU                       Total Elapsed
               00:13:19.97               00:13:35.00

Number of residues:               276734581
Number of sequences searched:        302507
Number of scores above cutoff:         4881



Cut-off raised to 4.
Cut-off raised to 5.
Cut-off raised to 6.
Cut-off raised to 7.
Cut-off raised to 8.
Cut-off raised to 9.
Cut-off raised to 10.
Cut-off raised to 11.
Cut-off raised to 12.
Cut-off raised to 13.
Cut-off raised to 14.

The scores below are sorted by initial score. 
Significance is calculated based on initial score. 
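
The Sig. column in the list that follows appears to be the number of standard deviations by which the initial score exceeds the mean of all scores. This reading is inferred from the reported statistics (mean 6, standard deviation 3.01) and is not a documented FastDB formula. A minimal Python sketch under that assumption; the function name `significance` is hypothetical:

```python
def significance(score: float, mean: float, std_dev: float) -> float:
    """Standard deviations by which `score` exceeds `mean` (assumed definition)."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (score - mean) / std_dev


# Mean and standard deviation as printed under SEARCH STATISTICS
# for the initial comparison.
MEAN, STD_DEV = 6, 3.01

for initial_score in (18, 17, 16):
    print(initial_score, round(significance(initial_score, MEAN, STD_DEV), 2))
```

Under this assumption, initial scores of 18, 17 and 16 give 3.99, 3.65 and 3.32, which are the three Sig. values that occur in the initial list.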

A 100% identical sequence to the query sequence was not found.
The list of best scores is: 

                                                    Init.   Opt.
Sequence Name Description                   Length  Score  Score   Sig. Frame

             **** 3 standard deviations above mean ****

 1. BTRPTDNAE B. taurus repeat region DNA.     482     18     18   3.99     0
 2. RABTCRGAM Rabbit T-cell receptor gamma     147     17     17   3.65     0
 3. Q77574    Human genome fragment. (Prefe    200     17     17   3.65     0
 4. HSAAACMHG H. sapiens putatively transcr    200     17     17   3.65     0
 5. ATTS1638  A. thaliana transcribed seque    274     17     17   3.65     0
 6. TBILTA124 T.brucei mRNA for variant sur   1688     17     17   3.65     0
 7. TBRVSG    T.brucei rhodensiense mRNA fo   1732     17     17   3.65     0
 8. U01312    Streptococcus pyogenes JRS4 p   1823     17     17   3.65     0
 9. S52562    LH-2=LIM/homeodomain protein    2072     17     17   3.65     0
10. HUMSWX167 Human chromosome X STS sWXD16    239     16     17   3.32     0
11. GCREG35   Galago Alu repeat type II, GA    245     16     16   3.32     0
12. HSA127WB5 H. sapiens (D1S505) DNA segme    319     16     16   3.32     0
13. NEUMTOLI2 N. crassa mitochondrial oli2     335     16     16   3.32     0
14. PLYORIA   Human polyomavirus BK (strain    375     16     16   3.32     0
15. M88810    CEL01E12 Caenorhabditis elega    394     16     16   3.32     0
16. S67523    early gene, late gene <contro    401     16     16   3.32     0
17. ATTS2283  A. thaliana transcribed seque    408     16     16   3.32     0
18. HS311VF9  H. sapiens (D5S662) DNA segme    414     16     16   3.32     0
19. PLYORIB   Human polyomavirus BK (strain    424     16     16   3.32     0
20. ATTS1882  A. thaliana transcribed seque    429     16     16   3.32     0
21. PVBECR522 Human papovavirus BK, Gardner    455     16     16   3.32     0
22. SYNECR530 BKV hybrid (tr-530) early tra    487     16     16   3.32     0
23. SYNECR532 BKV hybrid (tr-532) early tra    515     16     16   3.32     0
24. SYNECR531 BKV hybrid (tr-531) early tra    558     16     16   3.32     0
25. PVBECR501 Human papovavirus BK, Gardner    559     16     16   3.32     0
26. CEZMTTGP  Green turtle mitochondrion tr    620     16     16   3.32     0
27. Q58456    BK enhancer-adenovirus-2 late    642     16     16   3.32     0
28. Q54210    BK enhancer-adenovirus 2 late    642     16     16   3.32     0
29. HUMRPO    Human gene for ret proto-onco    678     16     16   3.32     0
30. ZEFTRANB  Danio rerio mRNA, Tc1-like tr    708     16     16   3.32     0




Query sequence being compared: CL16' (1-21)

Number of sequences optimized: 4881

Results of the optimized comparison of CL16' (1-21) with:

Data bank : EMBL-NEW 10, all entries
Data bank : GenBank 85, all entries
Data bank : GenBank-NEW 10, all entries
Data bank : HIV-NA 7, all entries
Data bank : Issued_NA , all entries
Data bank : N-GeneSeq 16.3, all entries
Data bank : UEMBL 40_85, all entries
Data bank : VectorBank 9, all entries



PARAMETERS

Similarity matrix         Unitary      K-tuple                  4
Mismatch penalty                1      Joining penalty         30
Gap penalty                  1.00      Window size             14
Gap size penalty             0.33
Cutoff score                    1
Randomization group             0

Initial scores to save         30      Alignments to save      30
Optimized scores to save       30      Display context        100




SEARCH STATISTICS

Scores:        Mean        Median        Standard Deviation
                 14            15                      0.78

Times:         CPU                       Total Elapsed
               00:01:03.97               00:01:08.00

Number of residues:                23291943
Number of sequences optimized:         4881



The scores below are sorted by optimized score.
Significance is calculated based on optimized score.

A 100% identical sequence to the query sequence was not found.
The list of best scores is: 

                                                    Init.   Opt.
Sequence Name Description                   Length  Score  Score   Sig. Frame

             **** 5 standard deviations above mean ****

 1. BTRPTDNAE B. taurus repeat region DNA.     482     18     18   5.15     0
 2. MUSMA     Mouse mRNA for ORF.             7222     16     18   5.15     0
 3. S92205    rnal2+=pre-rRNA maturation CS   3587     15     18   5.15     0
 4. ZEFTRAN   Danio rerio Tc1-like transpos   1205     16     18   5.15     0

             **** 3 standard deviations above mean ****

 5. HSAAACMHG H. sapiens putatively transcr    200     17     17   3.86     0
 6. TBILTA124 T.brucei mRNA for variant sur   1688     17     17   3.86     0
 7. ATTS1638  A. thaliana transcribed seque    274     17     17   3.86     0
 8. TBRVSG    T.brucei rhodensiense mRNA fo   1732     17     17   3.86     0
 9. S52562    LH-2=LIM/homeodomain protein    2072     17     17   3.86     0
10. HUMSWX167 Human chromosome X STS sWXD16    239     16     17   3.86     0
11. Q77574    Human genome fragment. (Prefe    200     17     17   3.86     0
12. U01312    Streptococcus pyogenes JRS4 p   1823     17     17   3.86     0
13. RABTCRGAM Rabbit T-cell receptor gamma     147     17     17   3.86     0
14. T16193    IB3780 Homo sapiens cDNA 3' en   498     15     17   3.86     0
15. ZEFTRAND  Danio rerio Tc1-like transpos   1241     15     17   3.86     0
16. SSIS1139  S.salivarius insertion sequen   1717     15     17   3.86     0
17. YSKSTE12X Kluyveromyces lactis STE12 ge   2678     15     17   3.86     0
18. CEZC84    Caenorhabditis elegans cosmid  38955     15     17   3.86     0
19. CEZC84    Caenorhabditis elegans cosmid  38955     15     17   3.86     0
20. CEZC84    Caenorhabditis elegans cosmid  38955     15     17   3.86     0
21. M28728    Figure 1. (B) Sequences in wt     51     14     17   3.86     0
22. Q38699    Oligonucleotide 7 to insert g     63     14     17   3.86     0
23. SV4MNKR5  simian virus 40/african green    115     14     17   3.86     0
24. HSBA7H052 H. sapiens partial cDNA seque    231     14     17   3.86     0
25. SV4MNKR4  simian virus 40/african green    250     14     17   3.86     0
26. SV4STA    Rhesus macaque polyoma virus     384     14     17   3.86     0
27. SV4MNKR3  simian virus 40/african green    593     14     17   3.86     0
28. SV4STA4   Rhesus macaque polyoma virus     694     14     17   3.86     0
29. HUMRAB6A  Homo sapiens GTP-binding prot    740     14     17   3.86     0
30. HSRAB6A   Homo sapiens GTP-binding prot    740     14     17   3.86     0



1. CL16' (1-21)
   BTRPTDNAE  B. taurus repeat region DNA.

LOCUS       BTRPTDNAE    482 bp    DNA             MAM       16-AUG-1993
DEFINITION  B. taurus repeat region DNA.
ACCESSION   Z25529
KEYWORDS    repeat region.
SOURCE      cattle.
  ORGANISM  Bos taurus
            Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia;
            Theria; Eutheria; Artiodactyla; Ruminantia; Pecora; Bovidae.
REFERENCE   1 (bases 1 to 482)
  AUTHORS   Szemraj,J., Plucienniczak,G., Jaworski,J. and Plucienniczak,A.
  TITLE     Evidence for homological recombination with participation of the
            bovine alu-like sequences
  JOURNAL   Unpublished
  STANDARD  full automatic
REFERENCE   2 (bases 1 to 482)
  AUTHORS   Plucienniczak,A.
  TITLE     Direct Submission
  JOURNAL   Submitted (12-AUG-1993) PLUCIENNICZAK A., PP TERPOL, LABORATORY OF
            GENETIC ENGINEERING, P.O.W. 57, SIERADZ, POLAND, 98-200
  STANDARD  full automatic
COMMENT     NCBI gi: 396758
FEATURES             Location/Qualifiers
     source          1..482
                     /organism="Bos taurus"
                     /clone="pUJ3.24"
                     /dev_stage="calf"
                     /tissue_type="thymus"
     repeat_unit     133..482
                     /partial
                     /note="Truncated 5' part of BDDF."
                     /rpt_type=DISPERSED
                     /evidence=experimental
                     /rpt_family="Bovine Dimer Driven Family (BDDF)"
                     /label=BDDF
                     /citation=[1]
     repeat_unit     373..426
                     /partial
                     /note="5' part of bovine alu-like monomer."
                     /rpt_type=FLANKING
                     /evidence=experimental
                     /rpt_family="bovine alu-like"
                     /citation=[1]
BASE COUNT      135 a    109 c    124 g    114 t

ORIGIN 

Initial Score = 18 Optimized Score = 18 Significance = 5.15 

Residue Identity = 85% Matches = 18 Mismatches = 3

Gaps = 0 Conservative Substitutions = 0 
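
The figures above can be checked by hand: 18 matches over 21 aligned bases is 85.7%, printed as 85%, which suggests the percentage is truncated rather than rounded. That is an inference from this report, not documented FastDB behaviour. The Python sketch below recounts the matches for the ungapped alignment; the function names are hypothetical, and the target segment is the 21-base stretch of BTRPTDNAE read off the alignment in this record (my reading of the OCR), so treat it as an assumption.

```python
def count_matches(query: str, target: str) -> tuple[int, int]:
    """Return (matches, mismatches) for two equal-length, ungapped sequences."""
    if len(query) != len(target):
        raise ValueError("sequences must have the same length")
    matches = sum(q == t for q, t in zip(query, target))
    return matches, len(query) - matches


def residue_identity_percent(matches: int, mismatches: int) -> int:
    """Percent identity over an ungapped alignment, truncated to an integer."""
    aligned = matches + mismatches
    if aligned == 0:
        raise ValueError("alignment is empty")
    return (100 * matches) // aligned


QUERY = "AAAAGTGCAAAAGCCTAGGAC"           # query as printed in the alignment
TARGET_SEGMENT = "AAAAGGGCAAAAGCATAGGAT"  # assumed reading of the aligned target bases

matches, mismatches = count_matches(QUERY, TARGET_SEGMENT)
print(matches, mismatches, residue_identity_percent(matches, mismatches))
```

With these inputs the sketch reports 18 matches, 3 mismatches and 85% identity, in agreement with the printed summary.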

GGGTCGATGGTGGAGAGGTCGTGACGAGAATGTAGTCCACTGGAGAAGGGAATGGCAAACTACTTCAGTATT
      340       350       360       370       380       390       400

                            X       10        20
                            AAAAGTGCAAAAGCCTAGGAC
                            ||||| |||||||| |||||
CTTGCCTTGAGAACCCCATGAACGTATGAAAAGGGCAAAAGCATAGGATAGCTGAAAGAGGAACTCCCCAGT
    410       420       430 X     440       450 X     460       470

CGATAGG
  480



IntelliGenetics

FastDB - Fast Pairwise Comparison of Sequences 
Release 5.4 

Results file janllf.res made by on Wed 11 Jan 95 12:47:30-PST.



Query sequence being compared:    CL26' (1-21)
Number of sequences searched:     302507
Number of scores above cutoff:    4881

Results of the initial comparison of CL26' (1-21) with:



Data bank : EMBL-NEW 10, all entries
Data bank : GenBank 85, all entries
Data bank : GenBank-NEW 10, all entries
Data bank : HIV-NA 7, all entries
Data bank : Issued_NA , all entries
Data bank : N-GeneSeq 16.3, all entries
Data bank : UEMBL 40_85, all entries
Data bank : VectorBank 9, all entries



PARAMETERS

Similarity matrix         Unitary      K-tuple                  4
Mismatch penalty                1      Joining penalty         30
Gap penalty                  1.00      Window size             14
Gap size penalty             0.33
Cutoff score                    1
Randomization group             0

Initial scores to save         30      Alignments to save      30
Optimized scores to save       30      Display context        100

SEARCH STATISTICS

Scores:        Mean        Median        Standard Deviation
                  6             7                      3.01

Times:         CPU                       Total Elapsed
               00:13:03.06               00:13:28.00

Number of residues:               276734581
Number of sequences searched:        302507
Number of scores above cutoff:         4881



Cut-off raised to 4.
Cut-off raised to 5.
Cut-off raised to 6.
Cut-off raised to 7.
Cut-off raised to 8.
Cut-off raised to 9.
Cut-off raised to 10.
Cut-off raised to 11.
Cut-off raised to 12.
Cut-off raised to 13.
Cut-off raised to 14.

The scores below are sorted by initial score. 
Significance is calculated based on initial score. 

A 100% identical sequence to the query sequence was not found.



The list of best scores is: 

                                                    Init.   Opt.
Sequence Name Description                   Length  Score  Score   Sig. Frame

             **** 3 standard deviations above mean ****

 1. BTRPTDNAE B. taurus repeat region DNA.     482     18     18   3.99     0
 2. RABTCRGAM Rabbit T-cell receptor gamma     147     17     17   3.65     0
 3. Q77574    Human genome fragment. (Prefe    200     17     17   3.65     0
 4. HSAAACMHG H. sapiens putatively transcr    200     17     17   3.65     0
 5. ATTS1638  A. thaliana transcribed seque    274     17     17   3.65     0
 6. TBILTA124 T.brucei mRNA for variant sur   1688     17     17   3.65     0
 7. TBRVSG    T.brucei rhodensiense mRNA fo   1732     17     17   3.65     0
 8. U01312    Streptococcus pyogenes JRS4 p   1823     17     17   3.65     0
 9. S52562    LH-2=LIM/homeodomain protein    2072     17     17   3.65     0
10. HUMSWX167 Human chromosome X STS sWXD16    239     16     17   3.32     0
11. GCREG35   Galago Alu repeat type II, GA    245     16     16   3.32     0
12. HSA127WB5 H. sapiens (D1S505) DNA segme    319     16     16   3.32     0
13. NEUMTOLI2 N. crassa mitochondrial oli2     335     16     16   3.32     0
14. PLYORIA   Human polyomavirus BK (strain    375     16     16   3.32     0
15. M88810    CEL01E12 Caenorhabditis elega    394     16     16   3.32     0
16. S67523    early gene, late gene <contro    401     16     16   3.32     0
17. ATTS2283  A. thaliana transcribed seque    408     16     16   3.32     0
18. HS311VF9  H. sapiens (D5S662) DNA segme    414     16     16   3.32     0
19. PLYORIB   Human polyomavirus BK (strain    424     16     16   3.32     0
20. ATTS1882  A. thaliana transcribed seque    429     16     16   3.32     0
21. PVBECR522 Human papovavirus BK, Gardner    455     16     16   3.32     0
22. SYNECR530 BKV hybrid (tr-530) early tra    487     16     16   3.32     0
23. SYNECR532 BKV hybrid (tr-532) early tra    515     16     16   3.32     0
24. SYNECR531 BKV hybrid (tr-531) early tra    558     16     16   3.32     0
25. PVBECR501 Human papovavirus BK, Gardner    559     16     16   3.32     0
26. CEZMTTGP  Green turtle mitochondrion tr    620     16     16   3.32     0
27. Q58456    BK enhancer-adenovirus-2 late    642     16     16   3.32     0
28. Q54210    BK enhancer-adenovirus 2 late    642     16     16   3.32     0
29. HUMRPO    Human gene for ret proto-onco    678     16     16   3.32     0
30. ZEFTRANB  Danio rerio mRNA, Tc1-like tr    706     16     16   3.32     0




Query sequence being compared: CL26' (1-21) 

Number of sequences optimized: 4881 

Results of the optimized comparison of CL26' (1-21) with: 

Data bank : EMBL-NEW 10, all entries
Data bank : GenBank 85, all entries
Data bank : GenBank-NEW 10, all entries
Data bank : HIV-NA 7, all entries
Data bank : Issued_NA , all entries
Data bank : N-GeneSeq 16.3, all entries
Data bank : UEMBL 40_85, all entries
Data bank : VectorBank 9, all entries



PARAMETERS

Similarity matrix         Unitary      K-tuple                  4
Mismatch penalty                1      Joining penalty         30
Gap penalty                  1.00      Window size             14
Gap size penalty             0.33
Cutoff score                    1
Randomization group             0

Initial scores to save         30      Alignments to save      30
Optimized scores to save       30      Display context        100



SEARCH STATISTICS

Scores:        Mean        Median        Standard Deviation
                 14            15                      0.78

Times:         CPU                       Total Elapsed
               00:01:01.91               00:01:09.00

Number of residues:                23291943
Number of sequences optimized:         4881



The scores below are sorted by optimized score. 
Significance is calculated based on optimized score. 

A 100% identical sequence to the query sequence was not found.

The list of best scores is:

                                                    Init.   Opt.
Sequence Name Description                   Length  Score  Score   Sig. Frame

             **** 5 standard deviations above mean ****

 1. BTRPTDNAE B. taurus repeat region DNA.     482     18     18   5.15     0
 2. MUSMA     Mouse mRNA for ORF.             7222     16     18   5.15     0
 3. S92205    rnal2+=pre-rRNA maturation CS   3587     15     18   5.15     0
 4. ZEFTRAN   Danio rerio Tc1-like transpos   1205     16     18   5.15     0

             **** 3 standard deviations above mean ****

 5. HSAAACMHG H. sapiens putatively transcr    200     17     17   3.86     0
 6. TBILTA124 T.brucei mRNA for variant sur   1688     17     17   3.86     0
 7. ATTS1638  A. thaliana transcribed seque    274     17     17   3.86     0
 8. TBRVSG    T.brucei rhodensiense mRNA fo   1732     17     17   3.86     0
 9. S52562    LH-2=LIM/homeodomain protein    2072     17     17   3.86     0
10. HUMSWX167 Human chromosome X STS sWXD16    239     16     17   3.86     0
11. Q77574    Human genome fragment. (Prefe    200     17     17   3.86     0
12. U01312    Streptococcus pyogenes JRS4 p   1823     17     17   3.86     0
13. RABTCRGAM Rabbit T-cell receptor gamma     147     17     17   3.86     0
14. T16193    IB3780 Homo sapiens cDNA 3' en   498     15     17   3.86     0
15. ZEFTRAND  Danio rerio Tc1-like transpos   1241     15     17   3.86     0
16. SSIS1139  S.salivarius insertion sequen   1717     15     17   3.86     0
17. YSKSTE12X Kluyveromyces lactis STE12 ge   2678     15     17   3.86     0
18. CEZC84    Caenorhabditis elegans cosmid  38955     15     17   3.86     0
19. CEZC84    Caenorhabditis elegans cosmid  38955     15     17   3.86     0
20. CEZC84    Caenorhabditis elegans cosmid  38955     15     17   3.86     0
21. M28728    Figure 1. (B) Sequences in wt     51     14     17   3.86     0
22. Q38699    Oligonucleotide 7 to insert g     63     14     17   3.86     0
23. SV4MNKR5  simian virus 40/african green    115     14     17   3.86     0
24. HSBA7H052 H. sapiens partial cDNA seque    231     14     17   3.86     0
25. SV4MNKR4  simian virus 40/african green    250     14     17   3.86     0
26. SV4STA    Rhesus macaque polyoma virus     384     14     17   3.86     0
27. SV4MNKR3  simian virus 40/african green    593     14     17   3.86     0
28. SV4STA4   Rhesus macaque polyoma virus     694     14     17   3.86     0
29. HUMRAB6A  Homo sapiens GTP-binding prot    740     14     17   3.86     0
30. HSRAB6A   Homo sapiens GTP-binding prot    740     14     17   3.86     0




1. CL26' (1-21)
   BTRPTDNAE  B. taurus repeat region DNA.

LOCUS       BTRPTDNAE    482 bp    DNA             MAM       16-AUG-1993
DEFINITION  B. taurus repeat region DNA.
ACCESSION   Z25529
KEYWORDS    repeat region.
SOURCE      cattle.
  ORGANISM  Bos taurus
            Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia;
            Theria; Eutheria; Artiodactyla; Ruminantia; Pecora; Bovidae.
REFERENCE   1 (bases 1 to 482)
  AUTHORS   Szemraj,J., Plucienniczak,G., Jaworski,J. and Plucienniczak,A.
  TITLE     Evidence for homological recombination with participation of the
            bovine alu-like sequences
  JOURNAL   Unpublished
  STANDARD  full automatic
REFERENCE   2 (bases 1 to 482)
  AUTHORS   Plucienniczak,A.
  TITLE     Direct Submission
  JOURNAL   Submitted (12-AUG-1993) PLUCIENNICZAK A., PP TERPOL, LABORATORY OF
            GENETIC ENGINEERING, P.O.W. 57, SIERADZ, POLAND, 98-200
  STANDARD  full automatic
COMMENT     NCBI gi: 396758
FEATURES             Location/Qualifiers
     source          1..482
                     /organism="Bos taurus"
                     /clone="pUJ3.24"
                     /dev_stage="calf"
                     /tissue_type="thymus"
     repeat_unit     133..482
                     /partial
                     /note="Truncated 5' part of BDDF."
                     /rpt_type=DISPERSED
                     /evidence=experimental
                     /rpt_family="Bovine Dimer Driven Family (BDDF)"
                     /label=BDDF
                     /citation=[1]
     repeat_unit     373..426
                     /partial
                     /note="5' part of bovine alu-like monomer."
                     /rpt_type=FLANKING
                     /evidence=experimental
                     /rpt_family="bovine alu-like"
                     /citation=[1]
BASE COUNT      135 a    109 c    124 g    114 t

ORIGIN 

Initial Score = 18 Optimized Score = 18 Significance = 5.15

Residue Identity = 85% Matches = 18 Mismatches = 3 

Gaps = 0 Conservative Substitutions = 0 

GGGTCGATGGTGGAGAGGTCGTGACGAGAATGTAGTCCACTGGAGAAGGGAATGGCAAACTACTTCAGTATT
      340       350       360       370       380       390       400

                            X       10        20
                            AAAAGTGCAAAAGCCTAGGAC
                            ||||| |||||||| |||||
CTTGCCTTGAGAACCCCATGAACGTATGAAAAGGGCAAAAGCATAGGATAGCTGAAAGAGGAACTCCCCAGT
    410       420       430 X     440       450 X     460       470

CGATAGG
  480
